Transfer Learning for Multiagent Reinforcement Learning Systems
Authors
Abstract
Reinforcement learning methods have been applied successfully to build autonomous agents that solve many sequential decision-making problems. However, agents need a long time to learn a suitable policy, especially when multiple autonomous agents share the environment. This research aims to propose a Transfer Learning (TL) framework that accelerates learning by exploiting two knowledge sources: (i) previously learned tasks; and (ii) advice from a more experienced agent. Defining such a framework requires answering several challenging research questions, including: how to abstract and represent knowledge so that it can be generalized and later reused; how and when to transfer and receive knowledge efficiently; and how to evaluate transfer quality in a multiagent scenario.
1 Context and Motivation
Reinforcement Learning (RL) [Sutton and Barto, 1998] is a widely used technique for building autonomous agents that learn through experimentation. First, the agent chooses an action that affects the environment; then it observes, through a reward function, how much that action contributed to task completion. By repeating this procedure many times, an agent can learn to solve tasks optimally. The main limitation of RL is that agents take a long time to learn how to solve tasks. However, as in human learning, previous knowledge can greatly accelerate the learning of a new task. For example, it is easier to learn Spanish when one already knows Portuguese (or a similar language).
Many RL domains can be treated as Multiagent Systems (MAS), in which multiple agents act in a shared environment. We are especially interested in Cooperative Multiagent RL (MARL), in which all agents work together to solve the same task. In such domains, another type of knowledge reuse is applicable: agents can communicate to transfer learned behaviors.
In the language learning example, being taught by a fluent speaker of the desired language can accelerate learning, because the teacher can identify the learner's mistakes and provide customized explanations and examples. However, learning how to act in a MAS may be difficult, since the environment becomes non-stationary due to the simultaneous actions of multiple agents.
Transfer Learning (TL) [Taylor and Stone, 2009] allows the reuse of knowledge acquired in previous tasks, and has been used to accelerate learning in RL domains and alleviate scalability issues. In MARL, TL can reuse knowledge either from previously learned tasks or from agent communication, in which one agent transfers learned behaviors to another. Even though TL has been used in many ways in MARL, there is no consensus on many of the aspects that must be defined in order to specify a TL algorithm. This research aims to specify a TL framework that allows knowledge reuse in multiagent domains from both previously learned tasks (when available) and agent tutoring, two scenarios that are common in human learning.
*This research is supported by CNPq (grant 311608/2014-0) and São Paulo Research Foundation (FAPESP), grant 2015/16310-4.
2 Research Goals and Expected Contributions
This research aims to propose a Transfer Learning framework that allows knowledge reuse in Multiagent Reinforcement Learning, both from previous tasks and among agents. Specifying such a method requires defining: (i) a model that allows knowledge generalization; (ii) what information is transferred across tasks or agents; and (iii) how to determine when the knowledge of a given agent must be transferred to another. Figure 1 depicts the proposed framework. The agent extracts knowledge from advice given by other agents and from previously solved tasks to accelerate the learning of a new task. The solution of this new task can then be abstracted and added to the knowledge base.
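The two knowledge sources named above can be illustrated with a small sketch. The text leaves open which model is used, what is transferred, and when, so everything here is an assumption made for illustration: knowledge is represented as a tabular Q-function, reuse from a previous task is done by copying Q-values through a hypothetical state mapping (`mapping`), and advice is a greedy action suggested by a more experienced agent under a hypothetical advice `budget`. None of these names or choices come from the proposed framework itself.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

QTable = Dict[Tuple[int, int], float]


def init_from_source(source_q: QTable, states: List[int], actions: List[int],
                     mapping: Callable[[int], Optional[int]]) -> QTable:
    """Source (i): seed a new Q-table from a previously learned task.

    `mapping` is a hypothetical inter-task mapping from a new-task state to a
    source-task state (or None); unmapped or unknown entries start at 0.0.
    """
    return {(s, a): source_q.get((mapping(s), a), 0.0)
            for s in states for a in actions}


class Advisor:
    """Source (ii): a more experienced agent that advises under a budget (assumed)."""

    def __init__(self, q: QTable, actions: List[int], budget: int) -> None:
        self.q, self.actions, self.budget = q, actions, budget

    def advise(self, state: int) -> Optional[int]:
        # No advice once the budget is spent or if the state is unknown to the advisor.
        if self.budget <= 0 or any((state, a) not in self.q for a in self.actions):
            return None
        self.budget -= 1
        return max(self.actions, key=lambda a: self.q[(state, a)])


def choose_action(q: QTable, state: int, actions: List[int],
                  advisor: Optional[Advisor], epsilon: float,
                  rng: random.Random) -> int:
    """Follow advice when given; otherwise act epsilon-greedily on the agent's own Q-table."""
    advice = advisor.advise(state) if advisor is not None else None
    if advice is not None:
        return advice
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])
```

In this sketch, a learner would call `init_from_source` once before training and `choose_action` at every step. Deciding when advice should actually be requested or given is one of the open questions listed above and is reduced here to a simple budget.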
3 Background and Related Work
Single-agent RL domains are usually modeled as a Markov Decision Process (MDP), which can be solved by RL. An MDP is described by the tuple ⟨S, A, T, R⟩ [Puterman, 2005], where S is the set of environment states, A is the set of actions available to an agent, T is the transition function, and R is the reward function, which gives feedback on progress toward task completion. At each decision step, an agent observes the state s and chooses an action a (among the applicable ones ...
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16)
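The MDP tuple (S, A, T, R) from the Background section can be made concrete with a short sketch. It assumes a deterministic transition function and a hypothetical toy task (a five-state chain with a reward for reaching the last state), and it uses tabular Q-learning as one standard RL method for solving the MDP. The excerpt does not say which algorithm the authors use, so this is only an illustration of the observe, act, and receive-reward loop.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class MDP:
    states: List[int]                          # S
    actions: List[int]                         # A
    transition: Callable[[int, int], int]      # T (deterministic here, by assumption)
    reward: Callable[[int, int, int], float]   # R(s, a, s')


def make_chain(n: int = 5) -> MDP:
    """Hypothetical toy task: action 1 moves right, action 0 moves left; the goal is state n-1."""
    def transition(s: int, a: int) -> int:
        return min(max(s + (1 if a == 1 else -1), 0), n - 1)

    def reward(s: int, a: int, s_next: int) -> float:
        return 1.0 if s_next == n - 1 and s != n - 1 else 0.0

    return MDP(list(range(n)), [0, 1], transition, reward)


def q_learning(mdp: MDP, episodes: int = 500, alpha: float = 0.5,
               gamma: float = 0.9, epsilon: float = 0.1,
               seed: int = 0) -> Dict[Tuple[int, int], float]:
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in mdp.states for a in mdp.actions}
    goal = mdp.states[-1]
    for _ in range(episodes):
        s = mdp.states[0]
        for _ in range(100):  # cap on episode length
            # Epsilon-greedy action choice with random tie-breaking.
            if rng.random() < epsilon:
                a = rng.choice(mdp.actions)
            else:
                best = max(q[(s, b)] for b in mdp.actions)
                a = rng.choice([b for b in mdp.actions if q[(s, b)] == best])
            s_next = mdp.transition(s, a)
            r = mdp.reward(s, a, s_next)
            # The goal is terminal, so there is no bootstrapping from it.
            future = 0.0 if s_next == goal else gamma * max(q[(s_next, b)] for b in mdp.actions)
            q[(s, a)] += alpha * (r + future - q[(s, a)])
            s = s_next
            if s == goal:
                break
    return q
```

Calling `q_learning(make_chain())` returns a Q-table in which moving right is preferred in every non-goal state of this toy chain.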
Similar resources
A Multiagent Reinforcement Learning algorithm to solve the Community Detection Problem
Community detection is a challenging optimization problem that consists of searching for communities that belong to a network under the assumption that the nodes of the same community share properties that enable the detection of new characteristics or functional relationships in the network. Although there are many algorithms developed for community detection, most of them are unsuitable when ...
Transfer Learning Method Using Ontology for Heterogeneous Multi-agent Reinforcement Learning
This paper presents a framework, called the knowledge co-creation framework (KCF), for heterogeneous multiagent robot systems that use a transfer learning method. A multiagent robot system (MARS) that utilizes reinforcement learning and a transfer learning method has recently been studied in real-world situations. In MARS, autonomous agents obtain behavior autonomously through multi-agent reinfo...
Hierarchical Functional Concepts for Knowledge Transfer among Reinforcement Learning Agents
This article introduces the notions of functional space and concept as a means of knowledge representation and abstraction for Reinforcement Learning agents. These definitions are used as a tool for knowledge transfer among agents. The agents are assumed to be heterogeneous; they have different state spaces but share the same dynamics, reward, and action space. In other words, the agents are assumed t...
Multiagent Reinforcement Learning for Multi-Robot Systems: A Survey
Multiagent reinforcement learning for multirobot systems is a challenging issue in both robotics and artificial intelligence. With ever-increasing interest in theoretical research and practical applications, considerable effort has gone toward providing solutions to this challenge. However, there are still many difficulties in scaling up the multiagent reinforcement l...
A Survey on Multiagent Reinforcement Learning Towards Multi-Robot Systems
Multiagent reinforcement learning for multirobot systems is a challenging issue in both robotics and artificial intelligence. With ever-increasing interest in theoretical research and practical applications, considerable effort has gone toward providing solutions to this challenge. However, there are still many difficulties in scaling up multiagent reinforcement learnin...
Transfer Learning in Multi-Agent Reinforcement Learning Domains
Transfer learning refers to the process of reusing knowledge from past tasks in order to speed up the learning procedure in new tasks. In reinforcement learning, where agents often require a considerable amount of training, transfer learning comprises a suitable solution for speeding up learning. Transfer learning methods have primarily been applied in single-agent reinforcement learning algori...
Journal title:
Volume, issue:
Pages: -
Publication date: 2016